The CMU-MIT REVERB Challenge 2014 System: Description and Results
Abstract
To evaluate state-of-the-art algorithms and draw new insights regarding potential future research directions in distant speech recognition, Kinoshita et al. [1] launched the REverberant Voice Enhancement and Recognition Benchmark Challenge, commonly known as the REVERB Challenge. It is intended to provide a test bed for researchers to evaluate their methods on common corpora with common evaluation metrics. In this work, we describe our system and present our results on the 2014 REVERB Challenge (RC). Our system comprises four primary components: an acoustic speaker tracking system to determine the speaker’s position; a beamformer that uses this position to focus on the desired speech while suppressing noise and reverberation; speaker clustering to determine sets of utterances spoken by the same speaker; and a speech recognition engine with speaker adaptation to extract word hypotheses from the enhanced waveforms produced by the beamformer. On the REAL RC evaluation data, our system obtained a word error rate of 39.9% with a single channel of the array, and 16.9% with the best beamformed signal.
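The results above are reported as word error rates. As a point of reference, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions; this is not the scoring tool used in the challenge or by the authors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER: {word_error_rate(ref, hyp):.3f}")
```

Because insertions count as errors, a word error rate computed this way can exceed 100% when the hypothesis is much longer than the reference.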
Similar resources
Speech Dereverberation by Constrained and Regularized Multi-channel Spectral Decomposition: Evaluated on Reverb Challenge
We present our contribution to the REVERB Challenge in this paper. A multi-channel speech dereverberation system combines cross-channel cancellation and spectral decomposition. The reverberation is modeled as a convolution operation in the spectral domain. Using the generalized Kullback-Leibler (KL) divergence, we decompose the reverberant magnitude spectrum into clean magnitude spectrum convol...
The MERL/MELCO/TUM System for the REVERB Challenge Using Deep Recurrent Neural Network Feature Enhancement
This paper describes our joint submission to the REVERB Challenge, which calls for automatic speech recognition systems that are robust against varying room acoustics. Our approach uses deep recurrent neural network (DRNN) based feature enhancement in the log spectral domain as a single-channel front-end. The system is generalized to multi-channel audio by performing single-channel feature enh...
Linear Prediction-based Dereverberation with Advanced Speech Enhancement and Recognition Technologies for the Reverb Challenge
This paper describes systems for the enhancement and recognition of distant speech recorded in reverberant rooms. Our speech enhancement (SE) system handles reverberation with blind deconvolution using linear filtering estimated by exploiting the temporal correlation of observed reverberant speech signals. Additional noise reduction is then performed using an MVDR beamformer and advanced model-...
Enhancement of Reverberant and Noisy Speech by Extending Its Coherence
We introduce a novel speech enhancement algorithm for removing reverberation and noise from recorded speech data. Our approach centers around using a single-channel minimum mean-square error log-spectral amplitude (MMSE-LSA) estimator, which applies gain coefficients in a time-frequency domain to suppress noise and reverberation. The main contribution of this paper is that the enhancement is done...
DARPA ATIS Test Results June 1990
The first Spoken Language System tests to be conducted in the DARPA Air Travel Information System (ATIS) domain took place during the period June 15-20, 1989. This paper presents a brief description of the test protocol, comparator software used for scoring results at NIST, test material selection process, and preliminary tabulation of the scored results for seven SLS systems from ...